Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games Using Baselines
Similar resources
Monte Carlo Sampling for Regret Minimization in Extensive Games
Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...
Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization
Online search in games has always been a core interest of artificial intelligence. Advances made in search for perfect information games (such as Chess, Checkers, Go, and Backgammon) have led to AI capable of defeating the world’s top human experts. Search in imperfect information games (such as Poker, Bridge, and Skat) is significantly more challenging due to the complexities introduced by hid...
Online Monte Carlo Counterfactual Regret Minimization for Search in Imperfect Information Games
Online search in games has been a core interest of artificial intelligence. Search in imperfect information games (e.g., Poker, Bridge, Skat) is particularly challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling, an online search variant of Monte Carlo Counterfactual Regret Minimization, which preserves its convergence to Nash eq...
Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization
Recently, there has been considerable progress towards algorithms for approximating Nash equilibrium strategies in extensive games. One such algorithm, Counterfactual Regret Minimization (CFR), has proven to be effective in two-player zero-sum poker domains. While the basic algorithm is iterative and performs a full game traversal on each iteration, sampling based approaches are possible. For i...
Supplemental Material for Monte Carlo Sampling for Regret Minimization in Extensive Games
The supplementary material presented here first presents a detailed description of the MCCFR algorithm. We then give proofs to Theorems 3, 4, and 5 from the submission Monte Carlo Sampling for Regret Minimization in Extensive Games. We begin with some preliminaries, then prove a general result about all members of the MCCFR family of algorithms (Theorem 18 in Section 6). We then use that result...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2019
ISSN: 2374-3468, 2159-5399
DOI: 10.1609/aaai.v33i01.33012157